COMMUNICATIONS MONITORING, PROCESSING AND INTRUSION DETECTION

FIELD OF THE INVENTION

The present invention is directed to technology for monitoring
data communications on a network. More particularly, it is
directed to a monitoring technology which can be used to
detect intrusions into or attacks on networks or terminals.



BACKGROUND

A computer network typified by the Internet needs to be
equipped with security measures to prevent the network, or
terminals connected to the network, from being intruded upon
or attacked (accessed) without authority.

Firewalls are commonly used as network security measures.
For example, TCP connections are prohibited from passing
through a DMZ (De-Militarized Zone) constructed from a
firewall on a boundary between the Internet and an intranet.
Thus, direct connections from the Internet to the intranet
can be prohibited by setting up firewall rules.

A router which connects networks incorporates a filtering
capability to limit data communications (hereinafter
referred to simply as communications) passing through it.
This capability can be used to prevent unauthorized access
between networks.

There are conventional techniques for tracing unauthorized
access detected on a network. Such techniques involve
accumulating log data on communications packets (hereinafter
referred to simply as packets) exchanged over a network in a
predetermined storage (log box) together with their data
size and detection time and, if unauthorized access is
detected, tracing it by comparing the unauthorized access
with the accumulated log information (e.g., see Published
Unexamined Patent Application No. 2001-217834, pp. 6-8).
These conventional techniques trace unauthorized access
offline using the accumulated log information rather than in
real time.

However, even if security measures such as firewalls and
routers' filtering capabilities are installed on the
network, it is not possible to prevent intrusions or attacks
made via a host computer placed under the management of the
security measures.

In the example above of installing a DMZ between the
Internet and an intranet, since individual TCP connections
via the DMZ are authorized, firewall rules cannot prohibit
indirect connections from the Internet to the intranet
through a TCP connection set up between the Internet and a
server (e.g., Web server, DNS (Domain Name System) server,
or mail server) in the DMZ and a TCP connection set up
between the server in the DMZ and the intranet.

Also, when the filtering capabilities of a router are used
to limit communications, filtering on the router cannot
prevent intrusions made in the following way. Specifically,
an attacker intrudes into a computer which will serve as a
stepping stone, erases logs on the computer, and attacks
another computer. As a result, it appears as if the attack
were made from the computer serving as the stepping stone.
Normally, an attacker attacks a target computer via two or
more stepping stones. A computer can be used as a stepping
stone even if it is not intruded upon itself. The use of a
proxy server for relaying is a case in point. However, even
if no real damage was done to the computer serving as a
stepping stone, the fact that the computer was used as a
stepping stone can damage the reputation of the organization
that manages the computer.

Since conventional techniques for tracing unauthorized
access detected on a network perform tracing through
matching against communications logs, they can trace even
communications conducted via a host computer placed under
the management of security measures as described above.
However, since they perform the matching process offline,
they cannot monitor unauthorized access in real time while
the communications are actually going on. Also, to trace
unauthorized access, it would be advantageous to have
communications logs known to be those of unauthorized
access.



SUMMARY OF THE INVENTION

Thus, an aspect of the present invention is to make it
possible to monitor communications conducted via a host
computer as well as communications conducted directly.

Another aspect of the present invention is to make it
possible to detect intrusions or attacks, including
unauthorized access, in real time through such monitoring
without the need for communications logs known to be those
of unauthorized access.

To achieve the above aspects, the present invention is
implemented as a communications monitoring system
comprising: a packet input means for receiving
communications packets flowing at arbitrary points on a
network; and matching means for performing real-time
matching between two packet streams composed of
communications packets.

In this communications monitoring system, the packet input
means may be a communications sensor connected to
predetermined points (points where communications are to be
monitored) on a network via a network interface while the
matching means may be a similarity calculator which
calculates formal similarity between two packet streams
composed of communications packets entering the sensor upon
arrival of the communications packets.

The present invention can also be implemented as a
communications monitoring method for monitoring data
communications on a network using a computer, comprising the
steps of: acquiring communications packets in sequence from
arbitrary points on a network and storing them in
predetermined storage means together with information about
a packet stream to which the communications packets belong;
on reception of a predetermined communications packet,
taking another communications packet, received within a
predetermined time before acquiring the predetermined
communications packet, out of the storage means; determining
formal similarity between a first packet stream which
contains up to the acquired communications packet and a
second packet stream to which the communications packet
taken out of the storage means belongs; and sending out a
predetermined alert according to the determined similarity.

Also, the present invention is implemented as an information
processing method for comparing two packet streams flowing
on a network, comprising the steps of: acquiring
communications packets in sequence from arbitrary points on
a network and storing them in predetermined storage means
together with information about a packet stream to which the
communications packets belong; on reception of a
predetermined communications packet, taking another
communications packet, received within a predetermined time
before acquiring the predetermined communications packet,
out of the storage means; and performing matching between a
first packet stream which contains up to the acquired
communications packet and a second packet stream to which
the communications packet taken out of the storage means
belongs.

BRIEF DESCRIPTION OF THE DRAWINGS:

These and other objects, features, and advantages of the
present invention will become apparent upon further
consideration of the following detailed description of the
invention when read in conjunction with the drawing figures,
in which:

Fig. 1 is a diagram showing a configuration of a computer
which implements a communications monitoring system
according to an example embodiment of the present
invention;

Fig. 2 is a diagram illustrating a functional configuration
of the communications monitoring system according to this
embodiment, including the computer, etc. shown in Fig. 1;

Fig. 3 is a diagram expressing two packet streams, observed
by a communications sensor according to this embodiment, as
changes in sequence numbers with respect to time;

Fig. 4 is a diagram illustrating a packet stream matching
method according to this embodiment;

Fig. 5 is a flowchart illustrating operation of the
communications sensor according to this embodiment;

Fig. 6 is a flowchart illustrating preparation of match
candidates in Step 503 of Fig. 5;

Fig. 7 is a flowchart illustrating a similarity calculation
process carried out by a similarity calculator according to
this embodiment;

Fig. 8 is a diagram showing a configuration example of a DMZ
(De-Militarized Zone); and

Fig. 9 is a configuration example for use when
communications among a plurality of networks are monitored
according to this embodiment.

DESCRIPTION OF SYMBOLS

10 ... Communications sensor

20 ... Packet database (DB)

30 ... Match candidate database (DB)

40 ... Similarity calculator

50 ... Candidate discarder

101 ... CPU

102 ... Memory

103 ... Network interface

104 ... Magnetic disc unit

DESCRIPTION OF THE INVENTION

The present invention provides methods, apparatus and
systems to monitor communications conducted via a host
computer as well as communications conducted directly. The
present invention also enables detection of intrusions or
attacks, including unauthorized access, in real time through
such monitoring without the need for communications logs
known to be those of unauthorized access.

In an example embodiment, the present invention is
implemented as a communications monitoring system
comprising: a packet input means for receiving
communications packets flowing at arbitrary points on a
network; and matching means for performing real-time
matching between two packet streams composed of
communications packets.

In this communications monitoring system, the packet input
means may be a communications sensor connected to
predetermined points (points where communications are to be
monitored) on a network via a network interface while the
matching means may be a similarity calculator which
calculates formal similarity between two packet streams
composed of communications packets entering the sensor upon
arrival of the communications packets.

In some embodiments, the communications monitoring system
further comprises alerting means for sending out a
predetermined alert to an operator, administrative function,
etc. according to the formal similarity between the two
packet streams determined by the matching means.

The formal similarity between two packet streams means
similarity in the amount of data and transmission interval
of packets irrespective of data content and is determined
based on a time lag between each corresponding pair of
communications packets in the two packet streams. More
specifically, the two packet streams can be represented by
graphs depicting amounts of data in communications packets
in respective packet streams with respect to elapsed time,
and the similarity between the two packet streams is
calculated based on the size of regions enclosed by the two
graphs when the graphs of the packet streams are moved close
to each other without intersecting each other.



In another embodiment, the present invention can be
implemented as a communications monitoring method for
monitoring data communications on a network using a
computer, comprising the steps of: acquiring communications
packets in sequence from arbitrary points on a network and
storing them in predetermined storage means together with
information about a packet stream to which the
communications packets belong; on reception of a
predetermined communications packet, taking another
communications packet, received within a predetermined time
before acquiring the predetermined communications packet,
out of the storage means; determining formal similarity
between a first packet stream which contains up to the
acquired communications packet and a second packet stream to
which the communications packet taken out of the storage
means belongs; and sending out a predetermined alert
according to the determined similarity.

In a further embodiment, the communications monitoring
method further comprises a step of discarding information
used in determining the similarity of second packet streams
except the second packet stream determined to be most
similar to the first packet stream. This makes it possible
to reduce memory usage and CPU loads on the computer.



Docket Number JP920020149US1 






The present invention is also implemented as an information
processing method for comparing two packet streams flowing
on a network, comprising the steps of: acquiring
communications packets in sequence from arbitrary points on
a network and storing them in predetermined storage means
together with information about a packet stream to which the
communications packets belong; on reception of a
predetermined communications packet, taking another
communications packet, received within a predetermined time
before acquiring the predetermined communications packet,
out of the storage means; and performing matching between a
first packet stream which contains up to the acquired
communications packet and a second packet stream to which
the communications packet taken out of the storage means
belongs.

Advantageously, in the step of calculating the similarity 
between the packet streams, the information processing 
method discards information used in determining the 
similarity if time-axis lengths of the regions enclosed by 
the two graphs are within a specific predetermined range. 
This makes it possible to reduce memory usage and CPU loads 
on the computer.

Furthermore, the present invention is also implemented as a
program which controls a computer and makes it execute
processes corresponding to the steps of the above-described
communications monitoring method or information processing
method, as well as a program for making the computer
implement the functions of the above-described
communications monitoring system. These programs can be
distributed on a magnetic disk, optical disk, semiconductor
memory, or other recording medium, or delivered via a
network.

The present invention will be described in further detail
below with reference to an embodiment illustrated in the
accompanying drawings. Incidentally, in this example
embodiment, TCP (Transmission Control Protocol) is used as a
network communications protocol.

Figure 1 is a diagram showing a configuration of a computer
which implements a communications monitoring system
according to this embodiment. As shown in Figure 1, the
computer which implements this embodiment comprises a CPU
101 which runs various processes, memory 102 which stores
programs for controlling the CPU 101 and data processed by
the CPU 101, and a network interface 103 for inputting
packets transmitted and received over a network. Also, the
computer comprises a magnetic disc unit 104 and saves
programs and data stored in the memory 102 to the magnetic
disc unit 104 as required.



Figure 2 is a diagram illustrating a functional configuration
of the communications monitoring system according to this
embodiment, consisting of the computer, etc. shown in Figure
1. Referring to Figure 2, the communications monitoring
system according to this embodiment comprises a
communications sensor 10, packet database (DB) 20, match
candidate database 30, similarity calculator 40, and
candidate discarder 50. Of these components, the
communications sensor 10, similarity calculator 40, and
candidate discarder 50 are implemented by the
program-controlled CPU 101 of the computer shown in Figure
1. Programs which implement these components can be
distributed on a magnetic disk, optical disk, semiconductor
memory, or other recording medium, or delivered via a
network. In the example of Figure 1, they are stored in the
magnetic disc unit 104, are read into the memory 102,
control the CPU 101, and thereby implement the functions of
the above components. On the other hand, the packet
database 20 and match candidate database 30 are implemented
by the memory 102 and magnetic disc unit 104.

In the configuration shown in Figure 2, the communications
sensor 10 connects, via the network interface 103 shown in
Figure 1, to predetermined points where packets on the
network will be monitored, receives flowing packets, and
stores them in the packet database 20. It is possible to
connect to any number of points, but communications are
monitored with respect to packets flowing through two of
them. If an input packet is a start packet of TCP
communications, a match candidate is prepared for it and
stored in the match candidate database 30. If the input
packet is suspected to be from an intruder or attacker, an
alert is sent out (to an operator, predetermined
administrative function, etc.). Thus, the communications
sensor 10 functions as packet input means, match candidate
preparation means, and alerting means. Detailed operation
of the communications sensor 10 will be described later.

The packet database 20 stores information about packets
(hereinafter referred to as packet information) obtained by
the communications sensor 10. Packet information contains
the arrival time and sequence number of a given packet as
well as packet stream information. A packet stream consists
of packets flowing in one direction out of packets exchanged
in one TCP communications session. Packet stream
information contains a set of four items (source IP address,
destination IP address, source port number, and destination
port number) which represent a TCP connection, as well as
orientation with respect to the TCP connection (whether the
packet stream is oriented in the same direction as or the
opposite direction to the TCP connection). The packet
information can be obtained from the acquired packet itself,
its header information, etc. As database access
capabilities, the packet database 20 has capabilities to:

1. retrieve a list of relevant packets using packet
stream information as an index, and

2. retrieve packets in time order.
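By way of illustration only, the packet information and the two access capabilities described above might be sketched as follows. The class and field names (StreamKey, PacketInfo, PacketDB, by_stream, in_time_order) are hypothetical and do not appear in the embodiment; an in-memory list and dictionary stand in for the memory 102 and magnetic disc unit 104.

```python
from bisect import insort
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StreamKey:
    """Packet stream information: the four-item set plus orientation."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    forward: bool  # True if oriented in the same direction as the TCP connection


@dataclass(order=True)
class PacketInfo:
    """Packet information; ordering compares arrival time only."""
    arrival_time: float
    seq: int = field(compare=False)
    stream: StreamKey = field(compare=False)


class PacketDB:
    """In-memory stand-in for the packet database 20 (an assumption)."""

    def __init__(self):
        self._by_time = []                   # kept sorted by arrival time
        self._by_stream = defaultdict(list)  # index: stream -> packets

    def store(self, pkt):
        insort(self._by_time, pkt)
        self._by_stream[pkt.stream].append(pkt)

    def by_stream(self, key):
        """Capability 1: list packets using stream information as an index."""
        return list(self._by_stream.get(key, []))

    def in_time_order(self, reverse=False):
        """Capability 2: retrieve packets in (optionally reverse) time order."""
        return list(reversed(self._by_time)) if reverse else list(self._by_time)
```

The reverse option is included because match candidate preparation, described later, takes packets out in reverse chronological order.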

The match candidate database 30 stores match candidates for
use in packet stream matching described later. A match
candidate is a data structure which is used to hold
in-progress reports on calculation of similarity between two
TCP communications sessions (described later) and consists
of two packet streams, sequence number offsets, and
similarity. Thus, the match candidate database 30 contains
the following information.

• Packet stream to be checked

• Packet stream for comparison

• Sequence number offset

• Similarity information (area, maximum length, and
minimum length: details will be described later).

Also, as database access capabilities, the match candidate
database 30 has capabilities to retrieve match candidates
using the packet stream for comparison as an index.
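As a non-authoritative sketch, a match candidate and its database might be represented as below. The names MatchCandidate, MatchCandidateDB, for_comparison and discard are hypothetical, a packet stream is represented by an arbitrary hashable tuple for brevity, and the default similarity values follow the initial values given for Step 606 (area = 0, maximum length = 0, minimum length = infinity).

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MatchCandidate:
    checked_stream: tuple       # packet stream to be checked
    comparison_stream: tuple    # packet stream for comparison
    seq_offset: int             # sequence number offset
    area: float = 0.0           # similarity information
    max_length: float = 0.0
    min_length: float = math.inf


class MatchCandidateDB:
    """In-memory stand-in for the match candidate database 30 (an assumption)."""

    def __init__(self):
        self._by_comparison = defaultdict(list)

    def store(self, cand):
        self._by_comparison[cand.comparison_stream].append(cand)

    def for_comparison(self, stream):
        """Retrieve candidates using the packet stream for comparison as an index."""
        return list(self._by_comparison.get(stream, []))

    def discard(self, cand):
        """Erase one candidate, as the candidate discarder 50 would request."""
        self._by_comparison[cand.comparison_stream].remove(cand)
```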

The similarity calculator 40 is a matching means which
acquires a match candidate packet stream from the match
candidate database 30, compares it with a packet stream
acquired by the communications sensor 10, and calculates
formal similarity (similarity in the amount of data and
transmission interval of packets irrespective of data
content) between the packet streams under instructions from
the communications sensor 10. The concept and calculation
method of similarity will be described in detail later.

To avoid explosion in the number of match candidates for a
predetermined packet stream, the candidate discarder 50
erases (discards) match candidates in the match candidate
database 30 as required.

Next, operation of this embodiment will be described
assuming a concrete situation in which network
communications need to be monitored by this embodiment.



Figure 8 is a diagram showing a configuration example of a
DMZ (De-Militarized Zone). As shown in Figure 8, the DMZ
810 has been configured such that inner servers (Web server
813, DNS server 814, and mail server 815) can be accessed
from the Internet 820 and an intranet 830 only through a
firewall 811 or 812. The existence of the DMZ 810 makes it
possible to pass only the traffic which is based on HTTP
(HyperText Transfer Protocol), SMTP (Simple Mail Transfer
Protocol), or other accepted communications protocols. In
Figure 8, rules have been set up for the firewall 811 to
allow access to the Web server 813 from the Internet 820
while rules have been set up for the firewall 812 to allow
access to the Web server 813 from the intranet 830. This
allows for e-mail delivery from the Internet 820 to the
intranet 830, for example. In this case, TCP communications
along an intrusion route (indicated by arrows in Figure 8)
running from the Internet 820 to a server in the DMZ 810
(e.g., the Web server 813 in Figure 8) and then from the
server in the DMZ 810 to the intranet 830, as with the
e-mail delivery route above, are difficult to detect with
intrusion detection tools of the firewalls 811 and 812 or
the like because they comply with the rules of the firewalls
811 and 812.



Computers which use a network provided by a service provider
(ISP: Internet Service Provider) can be used as stepping
stones for similar intrusions. Filtering on a router cannot
prevent such computers from being used as stepping stones.

To prevent intrusions and attacks which use secure host
computers on a network as relay points, this embodiment
advantageously monitors communications at points which could
be used as relay points. To achieve this, for example,
in the communications monitoring system shown in Figure 2,
the communications sensor 10 is connected to desired points
via the network interface 103. Specifically, to monitor TCP
communications conducted via the DMZ 810 shown in Figure 8,
the communications sensor 10 is connected to an interface of
the firewall 811 on the side of the DMZ 810 and an interface
of the firewall 812 on the side of the intranet 830. To
watch for any use of a host computer as a stepping stone,
the communications sensor 10 is connected to Internet-side
interfaces of the router. Incidentally, the communications
sensor 10 can make as many connections as there are points
to be monitored, but detection results are given as
similarity between packet streams at two points.

If it is detected that very similar packet streams are
transmitted (TCP communications are conducted) at the two
points at a small time interval, it is likely that the
network or a system connected to the network has been
intruded upon. In such a case, according to this embodiment,
a warning is issued to prompt the operator and the like to
take necessary measures.

Now, similarity between two packet streams will be
described.

Figure 3 is a diagram expressing two packet streams,
observed by the communications sensor 10, as changes in
sequence numbers with respect to time. The sequence number
represents data transmitted so far by TCP communications. A
random number is used as an initial value and the sequence
number is incremented by the amount of data transmitted. In
Figure 3, increases in sequence numbers (i.e., amount of
data transmitted) are graphed with respect to time. The
shapes of the graphs in Figure 3 are considered to represent
formal characteristics of the packet streams. Thus, the
similarity between two packet streams according to this
embodiment is defined as the minimum X-axis (time axis)
dimension (time lag) divided by the Y-axis dimension (amount
of transmitted data represented by the sequence number) of
regions enclosed by the two graphs when the graphs of the
packet streams are moved close to each other without
intersecting each other. In other words, the similarity
between the two packet streams is determined based on the
time lag between each corresponding pair of communications
packets in the two packet streams. Thus, similarity
information about match candidates stored in the match
candidate database 30 is represented by the total area of
the regions enclosed by the graphs of the two packet
streams, maximum length in the X-axis direction, and minimum
length in the X-axis direction (hereinafter, parameters
which represent the size of these areas are simply called
the area, maximum length, and minimum length).

When the similarity between two packet streams is defined in
this way, the similarity is not supposed to be calculated
until communications in one direction are finished.
However, to check two packet streams for a match in real
time, it is not desirable that loads are concentrated at the
end of communications. Thus, according to this embodiment,
a similarity candidate is calculated little by little each
time a packet is received.



Figure 4 is a diagram illustrating a packet stream matching
method according to this embodiment.

Since random numbers are used as the initial values of
sequence numbers as described above, for the sake of
comparison, sequence number offsets are determined for all
packet streams to be checked at the start of the packet
stream for comparison. Then, match candidates which contain
information about the packet streams to be checked and
sequence number offsets are prepared. Incidentally, at the
start of the packet stream, initial values (described later)
are used as similarity information about the first packet of
the packet stream for comparison.
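The role of the sequence number offset can be illustrated with a short sketch. It assumes the offset is simply subtracted from each observed sequence number so that every stream is graphed from zero, and it relies on the fact that TCP sequence numbers are 32-bit values that wrap around; the function name relative_seq is hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def relative_seq(seq: int, offset: int) -> int:
    """Amount of data sent since the offset, tolerant of 32-bit wraparound."""
    return (seq - offset) % SEQ_MODULUS
```

With this normalization, two streams whose initial sequence numbers were chosen independently at random can be plotted on the same Y axis, which is what the comparison in Figure 4 requires.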

As the packet stream for comparison advances with packets
being received one after another, changes in the similarity
of match candidates are calculated. The calculation of
changes includes calculating the area and the maximum and
minimum lengths in the X-axis direction of a region newly
enclosed as the graphs progress. The similarity of each
match candidate is given by the following formula.
According to this method, since only the changes resulting
from the progress of the graphs need to be calculated,
computational loads can be distributed.

Similarity = min[ |area - minimum length * height|,
                  |area - maximum length * height| ] / height
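The formula above can be written out directly, as in the following minimal sketch. It assumes that "height" is the Y-axis extent (amount of transmitted data) covered so far, and it returns infinity when no data has been transmitted, which is a choice made here for illustration and is not stated in the embodiment. As the flowchart description uses the lowest value as the best match, a smaller result is taken to mean closer streams.

```python
import math


def similarity(area: float, min_length: float, max_length: float,
               height: float) -> float:
    """min(|area - min_length*height|, |area - max_length*height|) / height."""
    if height <= 0:
        # Assumption: with no data transmitted yet there is no meaningful score.
        return math.inf
    shifted_by_min = abs(area - min_length * height)
    shifted_by_max = abs(area - max_length * height)
    return min(shifted_by_min, shifted_by_max) / height
```

For example, two streams separated by a constant lag of 0.5 seconds over 1000 bytes enclose an area of 500 with minimum and maximum lengths both 0.5, so the score is zero: shifting one graph by the lag makes the two coincide.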

The processing time and memory required by the
communications monitoring system are proportional to the
number of match candidates. The number of match candidates
is given by

O[(number of packet streams)^2 * (number of packets in a
packet stream)]

Thus, for real-time processing, match candidates must be 
reduced as required in the process of calculating 
similarity. For that purpose, this embodiment uses time 
lags between two TCP communications. 

Now let us consider time lags between two TCP communications
which are checked for a match according to this embodiment.
In the case of communications from the Internet to an
intranet via a server in the DMZ 810 in Figure 8, for
example, a time lag is caused as the communications are
conducted via one or two host computers. In the case of
communications in which a network provided by an ISP is used
as a stepping stone, a time lag is caused as the
communications are conducted via a small network. These
time lags are not more than half the command response time
on the terminal of the intruder. Thus, it is assumed that
the time lag between the two TCP communications which are
checked for a match can be kept under approximately 1 to 2
seconds (this time lag is referred to as the maximum packet
delay time).

Thus, during match candidate preparation, if the lag between
the arrival times of corresponding packets in two packet
streams is larger than the maximum packet delay time, no
match candidate is prepared. Also, when updating a match
candidate upon reception of a packet, if the maximum length
in the X-axis direction shown in Figure 4 exceeds the
maximum packet delay time or the minimum length is smaller
than the negative value of the maximum packet delay time
(negative maximum packet delay time), the candidate
discarder 50 erases the match candidate from the match
candidate database 30. These processes make it possible to
reduce memory usage and calculation time in the
communications monitoring system.
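The two pruning rules just described reduce to simple comparisons, sketched below under the assumption that all lengths and the maximum packet delay time are expressed in seconds. The function names may_prepare and should_discard are hypothetical.

```python
def may_prepare(arrival_lag: float, max_delay: float) -> bool:
    """A match candidate is prepared only if the arrival lag is within the limit."""
    return arrival_lag <= max_delay


def should_discard(max_length: float, min_length: float,
                   max_delay: float) -> bool:
    """Discard when the X-axis extent leaves the range [-max_delay, +max_delay]."""
    return max_length > max_delay or min_length < -max_delay
```

Note that a freshly prepared candidate with maximum length 0 and minimum length infinity is not discarded by this test.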
Figure 5 is a flowchart illustrating operation of the
communications sensor 10 under the above circumstances. As
shown in Figure 5, the communications sensor 10 receives a
packet via the network interface 103 and stores it in the
packet database 20 (Step 501). If the packet is the start
packet of a packet stream (start of TCP communications), the
communications sensor 10 prepares a match candidate
according to procedures described later and stores it in the
match candidate database 30 (Steps 502 and 503).

On the other hand, if the packet received is not the start 
packet of a packet stream, the communications sensor 10 
performs packet stream matching. Specifically, first, using 
the packet stream (packet stream for comparison) which 
contains the received packet as an index, the communications 
sensor 10 takes match candidates out of the match candidate 
database 30 (Steps 502 and 504). Then, the similarity 
calculator 40 calculates similarity for each of the match 
candidates (Step 505). Processes of the similarity 
calculator 40 will be described later. 

Next, based on the output from the similarity calculator 40,
the communications sensor 10 designates the lowest
similarity among the similarities of the match candidates as
similarity M (Step 506). If the number of packets in the
packet stream for comparison is larger than a preset
threshold (i.e., the packet stream for comparison is equal
to or longer than a fixed length) and the similarity M is
lower than a preset threshold, the communications sensor 10
determines that an intrusion has been detected (Steps 507
and 508). Then, the communications sensor 10 sends out the
information about the match candidate as warning information
(Step 509).

After Step 506, the communications sensor 10 instructs the 
candidate discarder 50 to erase the match candidates except 
the one with the similarity M from the match candidate 
database 30 (to reduce match candidates). 
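
The decision made in Steps 506 to 508 can be sketched as
follows. This is a minimal illustration only, not the
implementation of the embodiment: the names MatchCandidate,
evaluate_candidates, min_packets, and similarity_threshold
are hypothetical, and the sketch assumes, as the text above
implies, that a lower similarity value indicates a closer
match between two packet streams.

```python
from dataclasses import dataclass


@dataclass
class MatchCandidate:
    stream_id: str     # hypothetical label for the packet stream to be examined
    similarity: float  # assumed: lower value = closer match


def evaluate_candidates(candidates, packet_count, min_packets, similarity_threshold):
    """Return (best_candidate, intrusion_detected) for one received packet."""
    if not candidates:
        return None, False
    # Step 506: the lowest similarity among the candidates becomes similarity M.
    best = min(candidates, key=lambda c: c.similarity)
    # Steps 507 and 508: the stream must be long enough and M low enough.
    detected = packet_count > min_packets and best.similarity < similarity_threshold
    return best, detected
```

The returned candidate is the one that would be kept after
the other match candidates are erased, and it is also the
one whose information would be sent out as warning
information when the second return value is true.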

Figure 6 is a flowchart illustrating preparation of match
candidates in Step 503. As shown in Figure 6, the
communications sensor 10 designates the packet (start
packet) received at the beginning of a packet stream as the
packet stream for comparison (Step 601). Then, the
communications sensor 10 takes packets out of the packet
database 20 in reverse chronological order (Step 602). If
the time lag between the packets taken out of the packet
database 20 and the packet stream for comparison (received
packet) exceeds the maximum packet delay time, the
communications sensor 10 finishes processing (Step 603).

On the other hand, if the time lag between the packets taken
out of the packet database 20 and the packet stream for
comparison does not exceed the maximum packet delay time,
the communications sensor 10 retrieves the packet stream to
which the packets belong from the packet database 20 and
designates it as a packet stream to be examined (Steps 603
and 604). Then, the communications sensor 10 designates the
starting sequence number of the retrieved packets as offset
information (Step 605) and sets initial values of similarity
information as follows: area = 0, maximum length = 0, and
minimum length = ∞ (Step 606). Next, the similarity
calculator 40 calculates similarity (Step 607). Then the
flow returns to Step 602 and the above processes are
repeated until there is no more packet that would cause a
smaller time lag with respect to the packet stream for
comparison than the maximum packet delay time.

Through the above processes, as many match candidates are
prepared as there are packets which cause a smaller time lag
with respect to the packet stream for comparison than the
maximum packet delay time.
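
The loop of Steps 602 to 606 can be sketched as follows.
This is an illustrative sketch under stated assumptions, not
the implementation of the embodiment: the Packet and
Candidate types and the function prepare_candidates are
hypothetical, each stored packet's own sequence number
stands in for the starting sequence number of its stream,
packets belonging to the stream for comparison itself are
skipped, and the similarity calculation of Step 607 is
omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Packet:
    stream_key: tuple  # hypothetical: (src_ip, dst_ip, src_port, dst_port)
    arrival: float     # arrival time in seconds
    seq: int           # TCP sequence number


@dataclass
class Candidate:
    stream_key: tuple
    offset: int                  # Step 605: offset information
    area: float = 0.0            # Step 606: initial similarity information
    max_length: float = 0.0
    min_length: float = math.inf


def prepare_candidates(start_packet, stored_packets, max_delay):
    """Build one match candidate per stored packet within max_delay."""
    candidates = []
    # Step 602: take packets out in reverse chronological order.
    for pkt in sorted(stored_packets, key=lambda p: p.arrival, reverse=True):
        if pkt.stream_key == start_packet.stream_key:
            continue  # assumption: do not match a stream against itself
        # Step 603: finish once the time lag exceeds the maximum delay.
        if start_packet.arrival - pkt.arrival > max_delay:
            break
        # Steps 604 to 606: record the stream, its offset, and initial values.
        candidates.append(Candidate(pkt.stream_key, offset=pkt.seq))
    return candidates
```

Because the packets are visited newest first, the loop can
stop at the first packet that is too old instead of scanning
the whole packet database.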

Incidentally, in the above processes, no distinction is made
between the two points connected with the communications
sensor 10 in terms of which of them the packet stream for
comparison and the packet stream to be checked are received
from. Directions of communications are not distinguished
either. Thus, according to this embodiment, matching is
performed and similarity is calculated when two packet
streams are obtained, with no distinction between the two
points. This makes it possible to detect attacks which
access the network through one communications path.

Figure 7 is a flowchart illustrating a similarity
calculation process carried out by the similarity calculator
40. This process is performed under instructions from the
communications sensor 10 using graphs such as those shown in
Figure 3 (see Step 505) each time the packet stream for
comparison increases by one packet. The region newly
enclosed by the new portions of the graphs as the packet is
added is designated as region B and the area, maximum
length, and minimum length of region B are determined (Step
701). Consequently, the similarity information about the
two packet streams is updated as required (Step 702).
Specifically, the updated area is given by

Area = area just before update + area of region B

If the maximum length of region B is larger than the maximum
length just before the update, the updated maximum length is
given by

Maximum length = maximum length of B

If the minimum length of region B is smaller than the minimum
length just before the update, the updated minimum length is
given by

Minimum length = minimum length of B

Then, the similarity calculator 40 calculates similarity
using the above parameters and passes the calculated
similarity to the communications sensor 10 (Step 703).

If the maximum length of region B obtained in Step 701
exceeds the maximum packet delay time or the minimum length
of region B is smaller than the negative value of the
maximum packet delay time, the similarity calculator 40
instructs the candidate discarder 50 to erase the match
candidate from the match candidate database 30 (to reduce
match candidates).
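
The update of Step 702 and the discard test can be sketched
as follows. This is a minimal sketch, not the implementation
of the embodiment: SimilarityInfo and update_similarity are
hypothetical names, the area and lengths of region B are
assumed to have been measured already in Step 701, and the
running minimum is kept as the smallest value seen so far,
which is what an initial value of infinity implies. The
similarity formula of Step 703 is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class SimilarityInfo:
    area: float = 0.0
    max_length: float = 0.0
    min_length: float = math.inf


def update_similarity(info, region_area, region_max, region_min, max_delay):
    """Apply Step 702 in place; return False if the candidate should be erased."""
    info.area += region_area                             # accumulate area of region B
    info.max_length = max(info.max_length, region_max)   # keep the running maximum
    info.min_length = min(info.min_length, region_min)   # keep the running minimum
    # Discard test: region B lies outside the window of plus or minus max_delay.
    return not (region_max > max_delay or region_min < -max_delay)
```

A caller would invoke update_similarity once per packet
added to the packet stream for comparison and remove the
match candidate as soon as the function returns False.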

Next, description will be given of an application example of 
this embodiment as applied to a concrete network system. 

Figure 9 is a configuration example for use when 
communications among a plurality of networks are monitored 
according to this embodiment. Suppose a plurality of 
networks 910, 920, and 930 are connected via routers 901 and 
902 as shown in Figure 9. In comparison to Figure 8, it can 
be assumed that the networks 910, 920, and 930 correspond to 
the Internet 820, DMZ 810, and intranet 830, respectively,
while the routers 901 and 902 correspond to the firewalls 
811 and 812, respectively. When watching for any use of a 
host computer as a stepping stone, a computer in the network 
920 can be assumed to be a host computer 921. 

In a network system configured as shown in Figure 9, an
attacker passes through the network 910 and first attacks
the computer 921 in the network 920. Furthermore, it is
assumed that by attacking a security hole in the computer
921, the attacker succeeds in attacking a computer 931 in
the network 930. If it is assumed that the attacker
communicates along the paths indicated by the solid arrows
in Figure 9, the packets related to this communication
should pass through the router 901 and router 902.

The communications monitoring system according to this 
embodiment normally watches communications flowing through 
the router 901 and router 902, and thus instantly detects 
the communications initiated by the attacker. It reports 
the detected attack to an external administrative function. 

In the example of Figure 9, communications at two points are
monitored by this embodiment, but it is also possible to
monitor more than two points. In that case, the relationship
between communications at any two of the points is detected
in real time. As a special case, the network 910 and
network 930 may constitute an identical network. In that
case, the communications sensor 10 is connected to one
point, and this embodiment operates as a system which
detects attacks made via the single network 920.

Incidentally, in the embodiment described above, TCP is used
as a network communications protocol, but this embodiment is
applicable to other communications protocols as well. It
can be applied to network communications based on UDP or
other protocols as appropriate. When applying this
embodiment to another protocol, packet stream information
used to search the packet database 20 and parameters used to
calculate similarity are specified according to the packet
format of the given protocol.

For example, if the communications protocol used is UDP,
packet stream information contains a set of four items
(source IP address, destination IP address, source port
number, and destination port number), with its packets
sorted in order of arrival time and UDP data size. When UDP
is used, no such information as the sequence numbers of TCP
is available to calculate the total amount (bytes) of data
flowing in each stream up to a certain time point.
Therefore, the "UDP data size" in a UDP header is used to
calculate the total amount of transmitted data, which in
turn is used as the Y-axis (vertical axis) of graphs in
similarity calculation. The total amount of data flowing in
each stream up to a certain time point can be calculated by
totaling past UDP data sizes.
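
The running total used as the Y-axis value for UDP can be
sketched as follows. This is only an illustration: the
function name cumulative_udp_bytes and the representation of
a datagram as an (arrival time, UDP data size) pair are
assumptions made for the example, and the datagrams are
assumed to belong to a single packet stream already.

```python
from itertools import accumulate


def cumulative_udp_bytes(datagrams):
    """Return (arrival_time, total_bytes_so_far) points for one UDP stream.

    datagrams: iterable of (arrival_time, udp_data_size) pairs.
    """
    # Order the datagrams by arrival time.
    ordered = sorted(datagrams, key=lambda d: d[0])
    # Total the past UDP data sizes up to and including each datagram.
    totals = accumulate(size for _, size in ordered)
    return [(arrival, total) for (arrival, _), total in zip(ordered, totals)]
```

Each returned pair gives one point of the graph, with
arrival time on the X-axis and the total amount of data on
the Y-axis, in place of the TCP sequence number.
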

Thus, as described above, the present invention makes it
possible to monitor communications conducted via a host
computer as well as communications conducted directly.
Also, the present invention makes it possible to detect
intrusions or attacks, including unauthorized access, in
real time through such monitoring without the need for
communications logs known to be those of unauthorized
access.

Variations described for the present invention can be
realized in any combination desirable for each particular
application. Thus particular limitations, and/or embodiment
enhancements described herein, which may have particular
advantages to the particular application need not be used
for all applications. Also, not all limitations need be
implemented in methods, systems and/or apparatus including
one or more concepts of the present invention.

The present invention can be realized in hardware, software,
or a combination of hardware and software. A visualization
tool according to the present invention can be realized in a
centralized fashion in one computer system, or in a
distributed fashion where different elements are spread
across several interconnected computer systems. Any kind of
computer system - or other apparatus adapted for carrying
out the methods and/or functions described herein - is
suitable. A typical combination of hardware and software
could be a general purpose computer system with a computer
program that, when being loaded and executed, controls the
computer system such that it carries out the methods
described herein. The present invention can also be
embedded in a computer program product, which comprises all
the features enabling the implementation of the methods
described herein, and which - when loaded in a computer
system - is able to carry out these methods.

Computer program means or computer program in the present
context include any expression, in any language, code or
notation, of a set of instructions intended to cause a
system having an information processing capability to
perform a particular function either directly or after
conversion to another language, code or notation, and/or
reproduction in a different material form.

Thus the invention includes an article of manufacture which
comprises a computer usable medium having computer readable
program code means embodied therein for causing a function
described above. The computer readable program code means
in the article of manufacture comprises computer readable
program code means for causing a computer to effect the
steps of a method of this invention. Similarly, the present
invention may be implemented as a computer program product
comprising a computer usable medium having computer readable
program code means embodied therein for causing a function
described above. The computer readable program code means
in the computer program product comprises computer readable
program code means for causing a computer to effect one or
more functions of this invention. Furthermore, the present
invention may be implemented as a program storage device
readable by machine, tangibly embodying a program of
instructions executable by the machine to perform method
steps for causing one or more functions of this invention.

It is noted that the foregoing has outlined some of the more
pertinent objects and embodiments of the present invention.
This invention may be used for many applications. Thus,
although the description is made for particular arrangements
and methods, the intent and concept of the invention are
suitable and applicable to other arrangements and
applications. It will be clear to those skilled in the art
that modifications to the disclosed embodiments can be
effected without departing from the spirit and scope of the
invention. The described embodiments ought to be construed
to be merely illustrative of some of the more prominent
features and applications of the invention. Other
beneficial results can be realized by applying the disclosed
invention in a different manner or modifying the invention
in ways known to those familiar with the art.